Use combine_GTFS_feeds and then get frequent stops¶

Import libraries¶

In [41]:
#%matplotlib inline
import pandas as pd
import geopandas as gpd
import transit_service_analyst as tsa
import combine_gtfs_feeds
import plotly.express as px
import shapely.geometry
import numpy as np
import plotly.io as pio
pio.renderers.default='notebook'
#import nbformat
#import plotly.io as pio

Use combine_gtfs_feeds to merge feeds from the region into one. Note detailed logging. Only needed if you are using more than one feed¶

https://github.com/psrc/combine_gtfs_feeds/wiki/combine_gtfs_feeds-documentation¶

In [42]:
# Only needed if you are using more than one feed.
#C:/temp/gtfs_all is the location of your gtfs folders for each feed
#C:/temp/gtfs_dir is where the merged gtfs files will be written

!combine_gtfs_feeds run -g C:/temp/gtfs_all -s 20180417 -o C:/temp/gtfs_combined
------------------combine_gtfs_feeds Started----------------
GTFS Directory path is: C:\temp\gtfs_all
Output Directory path is: C:/temp/gtfs_combined
Service Date is: 20180417
Adding service_id KPOB-CS:221:0:Weekday:1:18MAR:12345 for feed CT
Adding service_id MCOB-DO:121:0:Weekday:1:18MAR:12345 for feed CT
Adding service_id 1 for feed ET
Adding service_id 98825 for feed KC
Adding service_id 68312 for feed KC
Adding service_id 30857 for feed KC
Adding service_id 104560 for feed KC
Adding service_id 104483 for feed KC
Adding service_id 103237 for feed KC
Adding service_id 102375 for feed KC
Adding service_id 101544 for feed KC
Adding service_id 100579 for feed KC
Adding service_id 95373 for feed KC
Adding service_id 94837 for feed KC
Adding service_id 94127 for feed KC
Adding service_id 92434 for feed KC
Adding service_id 40642 for feed KC
Adding service_id 35200 for feed KC
Adding service_id 34907 for feed KC
Adding service_id 4394 for feed KC
Adding service_id 4317 for feed KC
Adding service_id 4101 for feed KC
Adding service_id 3779 for feed KC
Adding service_id 3600 for feed KC
Adding service_id 3395 for feed KC
Adding service_id 3318 for feed KC
c:\stefan\combine_gtfs_feeds\combine_gtfs_feeds\cli\run.py:483: DtypeWarning: Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv(path / gtfs_file_name)
Adding service_id mtwtf for feed KT
Adding service_id mtwtf_PSNS for feed KT
Feed KT contains missing departure/arrival times. Interpolating missing times.
Adding service_id Default-62 for feed PT
Adding service_id WD_SNDR for feed ST
Adding service_id WD_TL for feed ST
Feed ST contains frequencies.txt...
Unique trips will be added to outputs based on headways in frequencies.txt
Adding service_id 20180417 for feed WSF
Finished running combine_gtfs_feeds

Create an instance of transit_service_analyst using the combined gtfs feed created on previous line.¶

In [43]:
path = r'C:/temp/gtfs_combined'
transit_analyst = tsa.load_gtfs(path, '20180417')
C:\Users\scoe\Anaconda3\envs\geopandas\lib\site-packages\transit_service_analyst\gtfs_service.py:116: DtypeWarning:

Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.

C:\Users\scoe\Anaconda3\envs\geopandas\lib\site-packages\geopandas\array.py:275: ShapelyDeprecationWarning:

The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.

C:\Users\scoe\Anaconda3\envs\geopandas\lib\site-packages\geopandas\array.py:275: ShapelyDeprecationWarning:

The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.

Use get_tph_at_stops to get peak frequent service¶

In [44]:
# the list below indicates which hours will be used to define peak service:
list_of_hours = ['hour_6', 'hour_7', 'hour_8', 'hour_15', 'hour_16', 'hour_17']
# minimum trips per hour
min_tph = 6
freq = transit_analyst.get_tph_at_stops() 
# create binary column for each hour
cols = []
for hour in list_of_hours:
    #print (len(freq))
    freq = freq[freq[hour]>=min_tph]
freq['frequent_sum'] = freq[list_of_hours].sum(axis=1)



print (f'There are {len(freq)} stops that have {min_tph} trips or more per hour.')
There are 996 stops that have 6 trips or more per hour.

Convert to GeoDataFrame¶

In [45]:
# join to stops
freq = freq.merge(transit_analyst.stops, how = 'left', on = 'stop_id')
# convert to GeoDataFrame:
gdf = gpd.GeoDataFrame(freq, geometry=gpd.points_from_xy(freq.stop_lon, freq.stop_lat))
C:\Users\scoe\Anaconda3\envs\geopandas\lib\site-packages\geopandas\array.py:275: ShapelyDeprecationWarning:

The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.

Map frequent stops. Size of stop determined by total number of trips.¶

In [46]:
fig = px.scatter_mapbox(freq,
                        lat=freq['stop_lat'],
                        lon=freq['stop_lon'],
                        hover_name="stop_id",
                        zoom=8,
                        mapbox_style="open-street-map",
                        size = 'frequent_sum')

fig.show()

Buffer Frequent Stops¶

In [47]:
gdf = gpd.GeoDataFrame(freq, geometry=gpd.points_from_xy(freq.stop_lon, freq.stop_lat))
# set initial geographic coordinate system.
gdf =gdf.set_crs(4326)
# need to switch to a projected system (state plane) to do buffer.
gdf = gdf.to_crs(2285)
gdf.geometry = gdf.geometry.buffer(1320)
# switch back to map
gdf = gdf.to_crs(4326)
gdf.set_index('stop_id', inplace = True)

fig = px.choropleth_mapbox(gdf,
                           geojson=gdf.geometry,
                           locations=gdf.index,
                           color="frequent_sum",
                           center={"lat": 47.6, "lon": -122.7073},
                           mapbox_style="open-street-map",
                           zoom=8)
fig.show()
C:\Users\scoe\Anaconda3\envs\geopandas\lib\site-packages\geopandas\array.py:275: ShapelyDeprecationWarning:

The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.

Export buffers to shapefile¶

In [48]:
gdf.to_file(r'C:/temp/buff_stops.shp')
C:\Users\scoe\AppData\Local\Temp\ipykernel_28764\2143338697.py:1: UserWarning:

Column names longer than 10 characters will be truncated when saved to ESRI Shapefile.